1 Introduction



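It was conjectured by Diósi, Feldmann and Kosloff in [3], based on thermodynamical considerations, that the von Neumann entropy of a quantum state equal to a mixture

$$R_n := \frac{1}{n}\left(\sigma \otimes \rho^{\otimes (n-1)} + \rho \otimes \sigma \otimes \rho^{\otimes (n-2)} + \dots + \rho^{\otimes (n-1)} \otimes \sigma\right)$$

exceeds the entropy of a component asymptotically by the Umegaki relative entropy $S(\sigma\|\rho)$, that is,

$$S(R_n) - (n-1)S(\rho) - S(\sigma) \to S(\sigma\|\rho) \tag{1}$$

as $n \to \infty$. Here $\rho$ and $\sigma$ are density matrices acting on a finite dimensional Hilbert space. Recall that $S(\sigma) = -\mathrm{Tr}\,\sigma \log \sigma$ and

$$S(\sigma\|\rho) = \begin{cases} \mathrm{Tr}\,\sigma(\log \sigma - \log \rho) & \text{if } \operatorname{supp} \sigma \le \operatorname{supp} \rho, \\ +\infty & \text{otherwise.} \end{cases}$$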

Concerning the background of quantum entropy quantities, we refer to PUJ H2] ■ 

Apparently no exact proof of (1) has been published even for the classical case, although a heuristic proof for that case is offered in [1].

In this paper an analytic proof of (1) is first given for the case $\operatorname{supp} \sigma \le \operatorname{supp} \rho$, using an inequality between the Umegaki and the Belavkin-Staszewski relative entropies, and the weak law of large numbers in the quantum case. In the second part of the paper, it is clarified that the problem is related to the theory of classical-quantum channels. The essential observation is the fact that $S(R_n) - (n-1)S(\rho) - S(\sigma)$ in the conjecture is a Holevo quantity (classical-quantum mutual information) for a certain channel for which the relative entropy emerges as the capacity per unit cost.

The two different proofs lead to two different generalizations of the conjecture. 



2 An analytic proof of the conjecture 

In this section we assume that $\operatorname{supp} \sigma \le \operatorname{supp} \rho$ for the support projections of $\sigma$ and $\rho$. One can simply compute:

$$S(R_n\|\rho^{\otimes n}) = \mathrm{Tr}\,(R_n \log R_n - R_n \log \rho^{\otimes n}) = -S(R_n) - (n-1)\,\mathrm{Tr}\,\rho \log \rho - \mathrm{Tr}\,\sigma \log \rho.$$

Hence the identity

$$S(R_n\|\rho^{\otimes n}) = -S(R_n) + (n-1)S(\rho) + S(\sigma\|\rho) + S(\sigma)$$

holds. It follows that the conjecture (1) is equivalent to the statement

$$S(R_n\|\rho^{\otimes n}) \to 0 \quad \text{as } n \to \infty$$

when $\operatorname{supp} \sigma \le \operatorname{supp} \rho$.
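The identity above can be checked numerically in the commuting special case, where $\rho$ and $\sigma$ are diagonal and every quantity reduces to a sum over index tuples. The following is a minimal sketch under that assumption; the spectra `lam` and `mu` and all function names are hypothetical, and all entries of `mu` are assumed strictly positive.

```python
import itertools
import math

def eta(t):
    """eta(t) = -t log t, with eta(0) = 0."""
    return -t * math.log(t) if t > 0 else 0.0

def entropy(p):
    """Entropy of a diagonal density matrix with diagonal p."""
    return sum(eta(t) for t in p)

def rel_entropy(q, p):
    """sum_i q_i log(q_i / p_i); assumes p_i > 0 wherever q_i > 0."""
    return sum(a * math.log(a / b) for a, b in zip(q, p) if a > 0)

def spectra(lam, mu, n):
    """Eigenvalues of R_n and of the n-fold product of rho, over all index tuples."""
    r_n, rho_n = [], []
    for idx in itertools.product(range(len(mu)), repeat=n):
        prod = math.prod(mu[i] for i in idx)
        # the k-th summand of R_n replaces the factor mu_{i_k} by lam_{i_k}
        r_n.append(sum(lam[i] / mu[i] for i in idx) * prod / n)
        rho_n.append(prod)
    return r_n, rho_n

def identity_sides(lam, mu, n):
    """Return (S(R_n || rho^n), -S(R_n) + (n-1)S(rho) + S(sigma||rho) + S(sigma))."""
    r_n, rho_n = spectra(lam, mu, n)
    lhs = rel_entropy(r_n, rho_n)
    rhs = (-entropy(r_n) + (n - 1) * entropy(mu)
           + rel_entropy(lam, mu) + entropy(lam))
    return lhs, rhs

lam, mu = (0.5, 0.3, 0.2), (0.6, 0.3, 0.1)  # hypothetical spectra of sigma and rho
for n in range(1, 7):
    lhs, rhs = identity_sides(lam, mu, n)
    print(n, round(lhs, 6), round(rhs, 6))
```

Both columns should agree for every `n`, and the common value should shrink as `n` grows, in line with the equivalent form of the conjecture.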

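Recall the Belavkin-Staszewski relative entropy

$$S_{BS}(\omega\|\rho) = \mathrm{Tr}\,\big(\omega \log(\omega^{1/2}\rho^{-1}\omega^{1/2})\big) = -\mathrm{Tr}\,\big(\rho\,\eta(\rho^{-1/2}\omega\rho^{-1/2})\big)$$

if $\operatorname{supp} \omega \le \operatorname{supp} \rho$, where $\eta(t) := -t \log t$, see [TJ El]. It was proved by Hiai and Petz that

$$S(\omega\|\rho) \le S_{BS}(\omega\|\rho), \tag{2}$$

see [6], or Proposition 7.11 in [10].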
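Inequality (2) can be illustrated on real symmetric $2 \times 2$ density matrices, for which the functional calculus has a closed form. This is only a sketch: the helper names and the two test matrices are assumptions made for illustration, both matrices are taken invertible, and the second formula for $S_{BS}$ above is used.

```python
import math

def eig_sym2(m):
    """Eigenvalues and orthonormal eigenvectors of a real symmetric 2x2 matrix."""
    (a, b), (_, c) = m
    mean, half = (a + c) / 2.0, (a - c) / 2.0
    r = math.hypot(half, b)
    if r == 0.0:
        return (mean, mean), ((1.0, 0.0), (0.0, 1.0))
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    top = (math.cos(theta), math.sin(theta))  # eigenvector for mean + r
    return (mean + r, mean - r), (top, (-top[1], top[0]))

def fun(m, f):
    """Functional calculus f(m) via the spectral decomposition."""
    vals, vecs = eig_sym2(m)
    return [[sum(f(t) * v[i] * v[j] for t, v in zip(vals, vecs))
             for j in range(2)] for i in range(2)]

def mul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(x):
    return x[0][0] + x[1][1]

def umegaki(omega, rho):
    """S(omega||rho) = Tr omega (log omega - log rho)."""
    return tr(mul(omega, fun(omega, math.log))) - tr(mul(omega, fun(rho, math.log)))

def belavkin_staszewski(omega, rho):
    """S_BS(omega||rho) = -Tr rho eta(Y) with Y = rho^{-1/2} omega rho^{-1/2}."""
    inv_sqrt = fun(rho, lambda t: t ** -0.5)
    y = mul(mul(inv_sqrt, omega), inv_sqrt)
    return tr(mul(rho, fun(y, lambda t: t * math.log(t))))  # -eta(t) = t log t

rho = [[0.7, 0.2], [0.2, 0.3]]       # hypothetical non-commuting pair
omega = [[0.5, -0.1], [-0.1, 0.5]]
print(umegaki(omega, rho), belavkin_staszewski(omega, rho))
```

For commuting (for instance diagonal) inputs the two quantities coincide, so a gap between the printed numbers reflects non-commutativity.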

Theorem 1. If $\operatorname{supp} \sigma \le \operatorname{supp} \rho$, then $S(R_n) - (n-1)S(\rho) - S(\sigma) \to S(\sigma\|\rho)$ as $n \to \infty$.

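Proof: We want to use the quantum law of large numbers, see Proposition 1.17 in [10]. Assume that $\rho$ and $\sigma$ are $d \times d$ density matrices and we may suppose that $\rho$ is invertible. Due to the GNS-construction with respect to the limit $\varphi_\infty$ of the product states $\varphi_n(A) = \mathrm{Tr}\,\rho^{\otimes n} A$ on the $n$-fold tensor product $M_d(\mathbb{C})^{\otimes n}$, $n \in \mathbb{N}$, all finite tensor products $M_d(\mathbb{C})^{\otimes n}$ are embedded into a von Neumann algebra $\mathcal{M}$ acting on a Hilbert space $\mathcal{H}$. If $\gamma$ denotes the right shift and $X := \rho^{-1/2}\sigma\rho^{-1/2}$, then $R_n$ is written as

$$R_n = (\rho^{1/2})^{\otimes n}\left(\frac{1}{n}\sum_{i=0}^{n-1}\gamma^i(X)\right)(\rho^{1/2})^{\otimes n}.$$

By inequality (2), we get

$$0 \le S(R_n\|\rho^{\otimes n}) \le S_{BS}(R_n\|\rho^{\otimes n}) = -\mathrm{Tr}\,\Big(\rho^{\otimes n}\,\eta\big((\rho^{-1/2})^{\otimes n} R_n (\rho^{-1/2})^{\otimes n}\big)\Big) = \left\langle \Omega, -\eta\left(\frac{1}{n}\sum_{i=0}^{n-1}\gamma^i(X)\right)\Omega \right\rangle, \tag{3}$$

where $\Omega$ is the cyclic vector in the GNS-construction.

The law of large numbers gives

$$\frac{1}{n}\sum_{i=0}^{n-1}\gamma^i(X) \to I$$

in the strong operator topology in $B(\mathcal{H})$, since $\varphi(X) = \mathrm{Tr}\,\rho\,\rho^{-1/2}\sigma\rho^{-1/2} = 1$.

Since the continuous functional calculus preserves the strong convergence (simply due to approximation by polynomials on a compact set), we obtain

$$\eta\left(\frac{1}{n}\sum_{i=0}^{n-1}\gamma^i(X)\right) \to \eta(I) = 0 \quad \text{strongly.}$$

This shows that the upper bound (3) converges to 0 and the proof is complete. □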
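By the same proof one can obtain that for

$$R_{m,n} := \frac{1}{n}\left(\sigma^{\otimes m} \otimes \rho^{\otimes (n-1)} + \rho \otimes \sigma^{\otimes m} \otimes \rho^{\otimes (n-2)} + \dots + \rho^{\otimes (n-1)} \otimes \sigma^{\otimes m}\right),$$

the limit relation

$$S(R_{m,n}) - (n-1)S(\rho) - mS(\sigma) \to mS(\sigma\|\rho) \tag{4}$$

holds as $n \to \infty$ when $m$ is fixed.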

In the next theorem we treat the probabilistic case in a matrix language. The proof includes the case when $\operatorname{supp} \sigma \le \operatorname{supp} \rho$ is not true. Readers who are not familiar with the quantum setting of the previous theorem may prefer to follow the arguments below.

Theorem 2. Assume that $\rho$ and $\sigma$ are commuting density matrices. Then $S(R_n) - (n-1)S(\rho) - S(\sigma) \to S(\sigma\|\rho)$ as $n \to \infty$.

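Proof: We may assume that $\rho = \mathrm{Diag}(\mu_1, \dots, \mu_\ell, 0, \dots, 0)$ and $\sigma = \mathrm{Diag}(\lambda_1, \dots, \lambda_d)$ are $d \times d$ diagonal matrices, $\mu_1, \dots, \mu_\ell > 0$ and $\ell < d$. (We may consider $\rho$, $\sigma$ in a matrix algebra of bigger size if $\rho$ is invertible.) If $\operatorname{supp} \sigma \le \operatorname{supp} \rho$, then $\lambda_{\ell+1} = \dots = \lambda_d = 0$; this will be called the regular case. When $\operatorname{supp} \sigma \le \operatorname{supp} \rho$ is not true, we may assume that $\lambda_d > 0$ and we refer to the singular case.

The eigenvalues of $R_n$ correspond to elements $(i_1, \dots, i_n)$ of $\{1, \dots, d\}^n$:

$$\frac{1}{n}\left(\lambda_{i_1}\mu_{i_2}\cdots\mu_{i_n} + \mu_{i_1}\lambda_{i_2}\mu_{i_3}\cdots\mu_{i_n} + \dots + \mu_{i_1}\cdots\mu_{i_{n-1}}\lambda_{i_n}\right). \tag{5}$$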

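We divide the eigenvalues into three different groups as follows:

(a) $A$ corresponds to $(i_1, \dots, i_n) \in \{1, \dots, d\}^n$ with $1 \le i_1, \dots, i_n \le \ell$,

(b) $B$ corresponds to $(i_1, \dots, i_n) \in \{1, \dots, d\}^n$ which contains exactly one $d$,

(c) $C$ is the rest of the eigenvalues.

If the eigenvalue (5) is in group $A$, then it is

$$\frac{(\lambda_{i_1}/\mu_{i_1}) + \dots + (\lambda_{i_n}/\mu_{i_n})}{n}\,\mu_{i_1}\mu_{i_2}\cdots\mu_{i_n}.$$

First we compute

$$\sum_{\kappa \in A}\eta(\kappa) = \sum_{i_1, \dots, i_n}\eta\left(\frac{(\lambda_{i_1}/\mu_{i_1}) + \dots + (\lambda_{i_n}/\mu_{i_n})}{n}\,\mu_{i_1}\cdots\mu_{i_n}\right).$$

Below the summations are over $1 \le i_1, \dots, i_n \le \ell$:

$$\begin{aligned}
\sum_{\kappa \in A}\eta(\kappa)
&= -\sum_{i_1, \dots, i_n}\frac{(\lambda_{i_1}/\mu_{i_1}) + \dots + (\lambda_{i_n}/\mu_{i_n})}{n}\,\mu_{i_1}\cdots\mu_{i_n}\log(\mu_{i_1}\cdots\mu_{i_n}) + Q_n \\
&= -\frac{1}{n}\left(\sum_{i_1, \dots, i_n}\lambda_{i_1}\mu_{i_2}\cdots\mu_{i_n}\log(\mu_{i_1}\cdots\mu_{i_n}) + \dots + \sum_{i_1, \dots, i_n}\mu_{i_1}\cdots\mu_{i_{n-1}}\lambda_{i_n}\log(\mu_{i_1}\cdots\mu_{i_n})\right) + Q_n \\
&= -\frac{1}{n}\sum_{k=1}^{n}\left(\sum_{j \ne k}\sum_{i_j}\mu_{i_j}\log\mu_{i_j} + \sum_{i_k}\lambda_{i_k}\log\mu_{i_k}\right) + Q_n \\
&= (n-1)S(\rho) - \sum_{i=1}^{\ell}\lambda_i\log\mu_i + Q_n,
\end{aligned}$$

where

$$Q_n := \sum_{i_1, \dots, i_n}\mu_{i_1}\cdots\mu_{i_n}\,\eta\left(\frac{(\lambda_{i_1}/\mu_{i_1}) + \dots + (\lambda_{i_n}/\mu_{i_n})}{n}\right).$$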

Consider a probability space

$$(\Omega, P) := \big(\{1, \dots, \ell\}^{\mathbb{N}}, (\mu_1, \dots, \mu_\ell)^{\mathbb{N}}\big),$$

where $(\mu_1, \dots, \mu_\ell)^{\mathbb{N}}$ is the product of the measure on $\{1, \dots, \ell\}$ with the distribution $(\mu_1, \dots, \mu_\ell)$. For each $n \in \mathbb{N}$ let $X_n$ be a random variable on $\Omega$ depending on the $n$th $\{1, \dots, \ell\}$ so that the value of $X_n$ at $i \in \{1, \dots, \ell\}$ is $\lambda_i/\mu_i$. Then $X_1, X_2, \dots$ are identically distributed independent random variables and $Q_n$ is the expectation value of

$$\eta\left(\frac{X_1 + \dots + X_n}{n}\right).$$

The strong law of large numbers says that

$$\frac{X_1 + \dots + X_n}{n} \to E(X_1) = \sum_{i=1}^{\ell}\left(\frac{\lambda_i}{\mu_i}\right)\mu_i = \sum_{i=1}^{\ell}\lambda_i \quad \text{almost surely.}$$

Since $\eta((X_1 + \dots + X_n)/n)$ is uniformly bounded, the Lebesgue bounded convergence theorem implies that

$$Q_n \to \eta\left(\sum_{i=1}^{\ell}\lambda_i\right)$$

as $n \to \infty$.
In the regular case $\sum_{i=1}^{\ell}\lambda_i = 1$, $Q_n \to 0$ and all non-zero eigenvalues are in group $A$. Hence we have

$$S(R_n) - (n-1)S(\rho) - S(\sigma) = -\sum_{i=1}^{\ell}\lambda_i\log\mu_i + \sum_{i=1}^{\ell}\lambda_i\log\lambda_i + Q_n = S(\sigma\|\rho) + Q_n$$

and the statement is clear.

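Next we consider the singular case, when we have

$$\sum_{\kappa \in A}\eta(\kappa) = (n-1)S(\rho) + O(1),$$

and we turn to eigenvalues in $B$. If the eigenvalue corresponding to $(i_1, \dots, i_n) \in \{1, \dots, d\}^n$ is in group $B$ and $i_1 = d$, then the eigenvalue is

$$\frac{\lambda_d}{n}\,\mu_{i_2}\cdots\mu_{i_n}.$$

It follows that

$$\begin{aligned}
\sum_{i_2, \dots, i_n}\eta\left(\frac{\lambda_d}{n}\,\mu_{i_2}\cdots\mu_{i_n}\right)
&= -\frac{\lambda_d}{n}\sum_{i_2, \dots, i_n}(\mu_{i_2}\cdots\mu_{i_n})\log(\mu_{i_2}\cdots\mu_{i_n}) - \frac{\lambda_d}{n}\log\frac{\lambda_d}{n} \\
&= \frac{\lambda_d}{n}(n-1)S(\rho) - \frac{\lambda_d}{n}\log\frac{\lambda_d}{n}.
\end{aligned}$$

When $i_2 = d, \dots, i_n = d$, we get the same quantity, so this should be multiplied by $n$:

$$\sum_{\kappa \in B}\eta(\kappa) = \lambda_d(n-1)S(\rho) - \lambda_d\log\frac{\lambda_d}{n}.$$

We make a lower estimate of the entropy of $R_n$ by computing $\sum_\kappa\eta(\kappa)$ when $\kappa$ runs over $A$ and $B$. It is clear now that

$$\begin{aligned}
S(R_n) - (n-1)S(\rho) - S(\sigma) &\ge \sum_{\kappa \in A}\eta(\kappa) + \sum_{\kappa \in B}\eta(\kappa) - (n-1)S(\rho) - S(\sigma) \\
&\ge \lambda_d(n-1)S(\rho) + \lambda_d\log n + O(1) \to +\infty
\end{aligned}$$

as $n \to \infty$. □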
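The role of $Q_n$ in the regular case can be made concrete for $d = \ell = 2$, where the expectation of $\eta((X_1 + \dots + X_n)/n)$ is a finite binomial sum. The sketch below evaluates that sum in the log domain to avoid underflow; the spectra `lam`, `mu` and the function name `q_n` are hypothetical choices for illustration, and both entries of `mu` are assumed positive.

```python
import math

def eta(t):
    """eta(t) = -t log t, with eta(0) = 0."""
    return -t * math.log(t) if t > 0 else 0.0

def q_n(lam, mu, n):
    """Q_n = E[eta((X_1 + ... + X_n)/n)], where X_k = lam_i/mu_i with probability mu_i."""
    x0, x1 = lam[0] / mu[0], lam[1] / mu[1]
    total = 0.0
    for k in range(n + 1):  # k = number of coordinates equal to the first symbol
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(mu[0]) + (n - k) * math.log(mu[1]))
        total += math.exp(log_pmf) * eta((k * x0 + (n - k) * x1) / n)
    return total

lam, mu = (0.5, 0.5), (0.8, 0.2)  # hypothetical commuting sigma and rho, regular case
d_rel = sum(a * math.log(a / b) for a, b in zip(lam, mu))  # S(sigma||rho)
for n in (1, 10, 100, 1000):
    # S(R_n) - (n-1)S(rho) - S(sigma) = S(sigma||rho) + Q_n in the regular case
    print(n, d_rel + q_n(lam, mu, n), d_rel)
```

For $n = 1$ the first printed value is zero up to rounding, because $R_1 = \sigma$, and it should approach the second column as $n$ increases.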



3 Interpretation as capacity 

A classical-quantum channel with classical input alphabet $\mathcal{X}$ transfers the input $x \in \mathcal{X}$ into the output $W(x) \equiv \rho_x$, which is a density matrix acting on a Hilbert space $\mathcal{K}$. We restrict ourselves to the case when $\mathcal{X}$ is finite and $\mathcal{K}$ is finite dimensional.
If a classical random variable $X$ is chosen to be the input, with probability distribution $P = \{p(x) : x \in \mathcal{X}\}$, then the corresponding output is the quantum state $\rho_X := \sum_{x \in \mathcal{X}} p(x)\rho_x$. When a measurement is performed on the output quantum system, it gives rise to an output random variable $Y$ which is jointly distributed with the input $X$. If a partition of unity $\{F_y : y \in \mathcal{X}\}$ in $B(\mathcal{K})$ describes the measurement, then

$$\mathrm{Prob}(Y = y \mid X = x) = \mathrm{Tr}\,\rho_x F_y \qquad (x, y \in \mathcal{X}). \tag{6}$$

According to the Holevo bound, we have

$$I(X \wedge Y) := H(Y) - H(Y|X) \le I(X, W) := S(\rho_X) - \sum_{x \in \mathcal{X}} p(x)S(\rho_x), \tag{7}$$

which is actually a simple consequence of the monotonicity of the relative entropy under state transformation [7], see also [11]. $I(X, W)$ is the so-called Holevo quantity or classical-quantum mutual information, and it satisfies the identity

$$\sum_{x \in \mathcal{X}} p(x)S(\rho_x\|\rho) = I(X, W) + S(\rho_X\|\rho), \tag{8}$$

where $\rho$ is an arbitrary density.
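Identity (8) is easy to verify numerically when all states commute, since then every density matrix is a probability vector and the entropies are classical. The following sketch makes exactly that simplifying assumption; the weights, output states and reference state are hypothetical values.

```python
import math

def entropy(p):
    """Entropy of a diagonal density matrix with diagonal p."""
    return -sum(t * math.log(t) for t in p if t > 0)

def rel_entropy(q, p):
    """sum_i q_i log(q_i / p_i); assumes p_i > 0 wherever q_i > 0."""
    return sum(a * math.log(a / b) for a, b in zip(q, p) if a > 0)

def identity_8_sides(weights, states, reference):
    """Return both sides of identity (8) for diagonal output states."""
    average = [sum(w * s[i] for w, s in zip(weights, states))
               for i in range(len(reference))]
    holevo = entropy(average) - sum(w * entropy(s) for w, s in zip(weights, states))
    lhs = sum(w * rel_entropy(s, reference) for w, s in zip(weights, states))
    rhs = holevo + rel_entropy(average, reference)
    return lhs, rhs

weights = (0.2, 0.5, 0.3)                                        # p(x)
states = ((0.7, 0.2, 0.1), (0.1, 0.8, 0.1), (0.3, 0.3, 0.4))     # diagonals of rho_x
reference = (0.25, 0.25, 0.5)                                    # diagonal of rho
print(identity_8_sides(weights, states, reference))
```

Since the last term of (8) is non-negative, the left-hand side is an upper bound for the Holevo quantity for any choice of the reference state.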

The channel is used to transfer sequences from the classical alphabet; $\mathbf{x} = (x_1, x_2, \dots, x_n) \in \mathcal{X}^n$ is transferred into the quantum state $W^{\otimes n}(\mathbf{x}) = \rho_{\mathbf{x}} := \rho_{x_1} \otimes \rho_{x_2} \otimes \dots \otimes \rho_{x_n}$. A code for the channel $W^{\otimes n}$ is defined by a subset $A_n \subset \mathcal{X}^n$, which is called a codeword set. The decoder is a measurement $\{F_{\mathbf{y}} : \mathbf{y} \in \mathcal{X}^n\}$. The probability of error is $\mathrm{Prob}(X \ne Y)$, where $X$ is the input random variable uniformly distributed on $A_n$ and the output random variable is determined by (6), where $x$ and $y$ are replaced by $\mathbf{x}$ and $\mathbf{y}$.

The essential observation is the fact that $S(R_n) - (n-1)S(\rho) - S(\sigma)$ in the conjecture is a Holevo quantity in case of a channel with input sequences $(x_1, x_2, \dots, x_n) \in \{0, 1\}^n$ and outputs $\rho_{x_1} \otimes \rho_{x_2} \otimes \dots \otimes \rho_{x_n}$, where $\rho_0 = \sigma$, $\rho_1 = \rho$ and the codewords are all sequences containing exactly one 0. More generally, we shall consider Holevo quantities

$$I(A, \rho_0, \rho_1) := S\left(\frac{1}{|A|}\sum_{\mathbf{x} \in A}\rho_{\mathbf{x}}\right) - \frac{1}{|A|}\sum_{\mathbf{x} \in A}S(\rho_{\mathbf{x}})$$

defined for any set $A \subset \{0, 1\}^n$ of binary sequences of length $n$.
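The observation that the expression in the conjecture is a Holevo quantity can be checked directly for diagonal states and small $n$. In this sketch the Holevo quantity is computed from the definition above, while the conjecture's expression is computed separately from the eigenvalue formula (5); the spectra and all function names are assumptions for illustration.

```python
import itertools
import math

def entropy(p):
    """Entropy of a diagonal density matrix with diagonal p."""
    return -sum(t * math.log(t) for t in p if t > 0)

def output_spectrum(word, p0, p1):
    """Diagonal of the product output state for rho_0 = Diag(p0), rho_1 = Diag(p1)."""
    factors = [p0 if bit == 0 else p1 for bit in word]
    return [math.prod(entries) for entries in itertools.product(*factors)]

def holevo(words, p0, p1):
    """I(A, rho_0, rho_1) with the uniform distribution on the codeword set."""
    outputs = [output_spectrum(w, p0, p1) for w in words]
    average = [sum(column) / len(outputs) for column in zip(*outputs)]
    return entropy(average) - sum(entropy(o) for o in outputs) / len(outputs)

def one_zero_words(n):
    """All binary sequences of length n containing exactly one 0."""
    return [tuple(0 if i == k else 1 for i in range(n)) for k in range(n)]

def conjecture_lhs(sigma, rho, n):
    """S(R_n) - (n-1)S(rho) - S(sigma), with the spectrum of R_n from formula (5)."""
    r_n = []
    for idx in itertools.product(range(len(rho)), repeat=n):
        terms = (sigma[idx[k]] * math.prod(rho[i] for j, i in enumerate(idx) if j != k)
                 for k in range(n))
        r_n.append(sum(terms) / n)
    return entropy(r_n) - (n - 1) * entropy(rho) - entropy(sigma)

sigma, rho = (0.5, 0.3, 0.2), (0.6, 0.3, 0.1)  # hypothetical spectra
for n in (2, 3, 4):
    print(n, holevo(one_zero_words(n), sigma, rho), conjecture_lhs(sigma, rho, n))
```

The two printed values should coincide for each `n`, because the uniform mixture of the outputs over the one-zero codewords is exactly $R_n$ and each output has entropy $S(\sigma) + (n-1)S(\rho)$.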

The concept related to the conjecture we study is the channel capacity per unit cost, which is defined next for simplicity only in the case where $\mathcal{X} = \{0, 1\}$, the cost of a character $0 \in \mathcal{X}$ is 1, while the cost of $1 \in \mathcal{X}$ is 0.

For a memoryless channel with a binary input alphabet $\mathcal{X} = \{0, 1\}$ and an $\varepsilon > 0$, a number $R > 0$ is called an $\varepsilon$-achievable rate per unit cost if for every $\delta > 0$ and for any sufficiently large $T$, there exists a code of length $n > T$ with at least $e^{T(R-\delta)}$ codewords such that each of the codewords contains at most $T$ 0's and the error probability is at most $\varepsilon$. The largest $R$ which is an $\varepsilon$-achievable rate per unit cost for every $\varepsilon > 0$ is the channel capacity per unit cost.
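Lemma 1. For an arbitrary $A \subset \{0, 1\}^n$,

$$I(A, \rho_0, \rho_1) \le c(A)S(\rho_0\|\rho_1)$$

holds, where

$$c(A) := \frac{1}{|A|}\sum_{\mathbf{x} \in A}|\{i : x_i = 0\}|.$$

Proof: Let $c(\mathbf{x}) := |\{i : x_i = 0\}|$ for $\mathbf{x} \in A$. Since $I(A, \rho_0, \rho_1)$ is a particular Holevo quantity $I(X, W^{\otimes n})$, we can use the identity (8) to get an upper bound

$$\frac{1}{|A|}\sum_{\mathbf{x} \in A}S(\rho_{\mathbf{x}}\|\rho_1^{\otimes n}) = \frac{1}{|A|}\sum_{\mathbf{x} \in A}c(\mathbf{x})S(\rho_0\|\rho_1) = c(A)S(\rho_0\|\rho_1)$$

for $I(A, \rho_0, \rho_1)$. □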
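Lemma 1 and its proof can be illustrated for diagonal states: by identity (8) with reference state $\rho_1^{\otimes n}$, the slack in the bound equals the relative entropy of the averaged output with respect to $\rho_1^{\otimes n}$. The sketch below checks both statements on a small, arbitrarily chosen codeword set; the spectra, the codewords and the function names are hypothetical.

```python
import itertools
import math

def entropy(p):
    """Entropy of a diagonal density matrix with diagonal p."""
    return -sum(t * math.log(t) for t in p if t > 0)

def rel_entropy(q, p):
    """sum_i q_i log(q_i / p_i); assumes p_i > 0 wherever q_i > 0."""
    return sum(a * math.log(a / b) for a, b in zip(q, p) if a > 0)

def output_spectrum(word, p0, p1):
    """Diagonal of the product output state for rho_0 = Diag(p0), rho_1 = Diag(p1)."""
    factors = [p0 if bit == 0 else p1 for bit in word]
    return [math.prod(entries) for entries in itertools.product(*factors)]

def lemma1_sides(words, p0, p1):
    """Return (I(A, rho_0, rho_1), c(A) S(rho_0||rho_1), S(average || rho_1^n))."""
    n = len(words[0])
    outputs = [output_spectrum(w, p0, p1) for w in words]
    average = [sum(column) / len(outputs) for column in zip(*outputs)]
    holevo = entropy(average) - sum(entropy(o) for o in outputs) / len(outputs)
    mean_cost = sum(w.count(0) for w in words) / len(words)  # c(A)
    bound = mean_cost * rel_entropy(p0, p1)
    slack = rel_entropy(average, output_spectrum((1,) * n, p0, p1))
    return holevo, bound, slack

p0, p1 = (0.5, 0.3, 0.2), (0.6, 0.3, 0.1)  # hypothetical spectra of rho_0 and rho_1
words = [(0, 1, 1, 1), (1, 0, 0, 1), (1, 1, 1, 1), (0, 0, 1, 0)]
print(lemma1_sides(words, p0, p1))
```

The first value should not exceed the second, and their difference should match the third value up to rounding.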

Lemma 2. If $A \subset \{0, 1\}^n$ is a code of the channel $W^{\otimes n}$, whose probability of error (for some decoding scheme) does not exceed a given $0 < \varepsilon < 1$, then

$$(1 - \varepsilon)\log|A| - \log 2 \le I(A, \rho_0, \rho_1).$$

Proof: The right-hand side is a bound for the classical mutual information $I(X \wedge Y) = H(Y) - H(Y|X)$, where $Y$ is the channel output, see (7). Since the error probability $\mathrm{Prob}(X \ne Y)$ is smaller than $\varepsilon$, application of the Fano inequality (see [3]) gives

$$H(X|Y) \le \varepsilon\log|A| + \log 2.$$

Therefore

$$I(X \wedge Y) = H(X) - H(X|Y) \ge (1 - \varepsilon)\log|A| - \log 2,$$

and the proof is complete. □

The above two lemmas show that the relative entropy $S(\rho_0\|\rho_1)$ is an upper bound for the channel capacity per unit cost of the channel $W(0) = \rho_0$ and $W(1) = \rho_1$ with a binary input alphabet. In fact, assume that $R > 0$ is an $\varepsilon$-achievable rate. For every $\delta > 0$ and $T > 0$ there is a code $A \subset \{0, 1\}^n$ for which we get by Lemmas 1 and 2

$$\begin{aligned}
T\,S(\rho_0\|\rho_1) &\ge c(A)S(\rho_0\|\rho_1) \ge I(A, \rho_0, \rho_1) \\
&\ge (1 - \varepsilon)\log|A| - \log 2 \\
&\ge (1 - \varepsilon)T(R - \delta) - \log 2.
\end{aligned}$$

Since $T$ is arbitrarily large and $\varepsilon$, $\delta$ are arbitrarily small, $R \le S(\rho_0\|\rho_1)$ follows. That $S(\rho_0\|\rho_1)$ equals the channel capacity per unit cost will be verified below.

Theorem 3. Let the classical-quantum channel $W : \mathcal{X} = \{0, 1\} \to B(\mathcal{K})$ be defined as $W(0) = \rho_0 = \sigma$ and $W(1) = \rho_1 = \rho$. Assume that $A_n \subset \{0, 1\}^n$ is chosen such that

(a) each element $\mathbf{x} = (x_1, x_2, \dots, x_n) \in A_n$ contains at most $\ell$ copies of 0,

(b) $\log|A_n| / \log n \to c$ as $n \to \infty$,

(c) $c(A_n) := \dfrac{1}{|A_n|}\sum_{\mathbf{x} \in A_n}|\{i : x_i = 0\}| \to c$ as $n \to \infty$

for some real number $c > 0$ and for some natural number $\ell$. If the random variable $X_n$ has a uniform distribution on $A_n$, then

$$\lim_{n \to \infty}\left(S(\rho_{X_n}) - \frac{1}{|A_n|}\sum_{\mathbf{x} \in A_n}S(\rho_{\mathbf{x}})\right) = c\,S(\sigma\|\rho).$$

The proof of the theorem is divided into lemmas. We need the direct part of the 
so-called quantum Stein lemma obtained in [6] , see also [21 El El [T2] • 

Lemma 3. Let $\rho_0$ and $\rho_1$ be density matrices. For every $\eta > 0$ and $0 < R < S(\rho_0\|\rho_1)$, if $N$ is sufficiently large, then there is a projection $E \in B(\mathcal{K}^{\otimes N})$ such that

$$\alpha_N[E] := \mathrm{Tr}\,\rho_0^{\otimes N}(I - E) < \eta$$

and for $\beta_N[E] := \mathrm{Tr}\,\rho_1^{\otimes N}E$ the estimate

$$\frac{1}{N}\log\beta_N[E] < -R$$

holds.

Note that $\alpha_N$ is called the error of the first kind, while $\beta_N$ is the error of the second kind.

Lemma 4. Assume that $\varepsilon > 0$, $0 < R < S(\rho_0\|\rho_1)$, $\ell$ is a positive integer and the sequences $\mathbf{x}$ in $A_n \subset \{0, 1\}^n$ contain at most $\ell$ copies of 0. Let the codewords be the $N$-fold repetitions $\mathbf{x}^N = (\mathbf{x}, \mathbf{x}, \dots, \mathbf{x})$ of the sequences $\mathbf{x} \in A_n$. If $N$ is the integer part of

$$\frac{1}{R}\log\frac{2n}{\varepsilon}$$

and $n$ is large enough, then there is a decoding scheme such that the error probability is smaller than $\varepsilon$.



Proof: We follow the probabilistic construction in [13]. Let the codewords be the $N$-fold repetitions $\mathbf{x}^N = (\mathbf{x}, \mathbf{x}, \dots, \mathbf{x})$ of the sequences $\mathbf{x} \in A_n$. The corresponding output density matrices act on the Hilbert space $\mathcal{K}^{\otimes Nn} = (\mathcal{K}^{\otimes n})^{\otimes N}$. We decompose this Hilbert space into an $N$-fold product in a different way. For each $1 \le i \le n$, let $\mathcal{K}_i$ be the tensor product of the factors $i, i + n, i + 2n, \dots, i + (N-1)n$. So $\mathcal{K}^{\otimes Nn}$ is identified with $\mathcal{K}_1 \otimes \mathcal{K}_2 \otimes \dots \otimes \mathcal{K}_n$.

For each $1 \le i \le n$ we perform a hypothesis testing on the Hilbert space $\mathcal{K}_i$. The 0-hypothesis is that the $i$th component of the actually chosen $\mathbf{x} \in A_n$ is 0. Based on the channel outputs at time instances $i, i + n, \dots, i + (N-1)n$, the 0-hypothesis is tested against the alternative hypothesis that the $i$th component of $\mathbf{x}$ is 1. According to the quantum Stein lemma (Lemma 3), given any $\eta > 0$ and $0 < R < S(\sigma\|\rho)$, for $N$ sufficiently large, there exists a test $E_i$ such that the probability of error of the first kind is smaller than $\eta$, while the probability of error of the second kind is smaller than $e^{-NR}$. The projections $E_i$ and $I - E_i$ form a partition of unity in the Hilbert space $\mathcal{K}_i$, and the $n$-fold tensor products of these commuting projections give a partition of unity in $\mathcal{K}^{\otimes Nn}$. Let $\mathbf{y} \in \{0, 1\}^n$ and set $F_{\mathbf{y}} := \otimes_{i=1}^{n}F_{y_i}$, where $F_{y_i} = E_i$ if $y_i = 0$ and $F_{y_i} = I - E_i$ if $y_i = 1$. Therefore, the result of decoding can be an arbitrary 0-1 sequence in $\{0, 1\}^n$.

The decoding scheme gives $\mathbf{y} \in \{0, 1\}^n$ in such a way that $y_i = 0$ if the tests accepted the 0-hypothesis for $i$ and $y_i = 1$ if the alternative was accepted. The error probability should be estimated:

$$\mathrm{Prob}(Y \ne X \mid X = \mathbf{x}) = \sum_{\mathbf{y} : \mathbf{y} \ne \mathbf{x}}\mathrm{Tr}\,\rho_{\mathbf{x}^N}F_{\mathbf{y}} = \sum_{\mathbf{y} : \mathbf{y} \ne \mathbf{x}}\prod_{i=1}^{n}\mathrm{Tr}\,\rho_{x_i}^{\otimes N}F_{y_i} \le \sum_{i=1}^{n}\mathrm{Tr}\,\rho_{x_i}^{\otimes N}(I - F_{x_i}).$$

If $x_i = 0$, then

$$\mathrm{Tr}\,\rho_{x_i}^{\otimes N}(I - F_{x_i}) = \mathrm{Tr}\,\rho_0^{\otimes N}(I - E_i) \le \eta$$

because it is an error of the first kind. When $x_i = 1$,

$$\mathrm{Tr}\,\rho_{x_i}^{\otimes N}(I - F_{x_i}) = \mathrm{Tr}\,\rho_1^{\otimes N}E_i \le e^{-NR}$$

from the error of the second kind. It follows that $\ell\eta + ne^{-NR}$ is a bound for the error probability. The first term will be small if $\eta$ is small. The second term will be small if $N$ is large enough. If both terms are majorized by $\varepsilon/2$, then the statement of the lemma holds. We can choose $n$ so large that $N$ defined by the statement is large enough. □
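The arithmetic behind the bound $\ell\eta + ne^{-NR}$ can be sketched as follows. Note one deliberate deviation, stated as an assumption: the lemma takes $N$ as the integer part of $(1/R)\log(2n/\varepsilon)$, whereas this sketch rounds up so that $ne^{-NR} \le \varepsilon/2$ holds literally. The function names and the numerical parameters are hypothetical.

```python
import math

def repetitions_needed(n, eps, rate):
    """Smallest integer N with n * exp(-N * rate) <= eps / 2 (rounded up, an assumption)."""
    return math.ceil(math.log(2 * n / eps) / rate)

def error_bound(ell, eta, n, reps, rate):
    """The bound ell * eta + n * exp(-N * R) from the proof of Lemma 4."""
    return ell * eta + n * math.exp(-reps * rate)

n, eps, rate, ell = 1000, 0.1, 0.2, 3   # hypothetical parameters
eta = eps / (2 * ell)                   # makes the first term equal to eps / 2
reps = repetitions_needed(n, eps, rate)
print(reps, error_bound(ell, eta, n, reps, rate))
```

Because $N$ grows only like $\log n$, the ratio $\log|A_n| / N$ stays comparable to $R\log|A_n| / \log n$, which is what the proof of Theorem 3 exploits.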

Proof of Theorem 3: Since Lemma 1 gives an upper bound, that is,

$$\limsup_{n \to \infty}\left(S(\rho_{X_n}) - \frac{1}{|A_n|}\sum_{\mathbf{x} \in A_n}S(\rho_{\mathbf{x}})\right) \le c\,S(\sigma\|\rho),$$

it remains to prove that

$$\liminf_{n \to \infty}\left(S(\rho_{X_n}) - \frac{1}{|A_n|}\sum_{\mathbf{x} \in A_n}S(\rho_{\mathbf{x}})\right) \ge c\,S(\sigma\|\rho).$$

Lemma 4 is about the $N$-times repeated input $X_n^N$ and describes a decoding scheme with error probability at most $\varepsilon$. According to Lemma 2 we have

$$(1 - \varepsilon)\log|A_n| - \log 2 \le S(\rho_{X_n^N}) - \frac{1}{|A_n|}\sum_{\mathbf{x} \in A_n}S(\rho_{\mathbf{x}^N}).$$

From the subadditivity of the entropy we have

$$S(\rho_{X_n^N}) \le N S(\rho_{X_n})$$

and

$$S(\rho_{\mathbf{x}^N}) = N S(\rho_{\mathbf{x}})$$

holds due to the additivity for products. It follows that

$$(1 - \varepsilon)\frac{\log|A_n|}{N} - \frac{\log 2}{N} \le S(\rho_{X_n}) - \frac{1}{|A_n|}\sum_{\mathbf{x} \in A_n}S(\rho_{\mathbf{x}}).$$

From the choice of $N$ in Lemma 4 we have

$$R\,\frac{\log|A_n|}{\log n}\cdot\frac{\log n}{\log n + \log 2 - \log\varepsilon} \le \frac{\log|A_n|}{N}$$

and the lower bound is arbitrarily close to $cR$. Since $R < S(\rho_0\|\rho_1)$ was arbitrary, the proof is complete. □